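Traffic Scene Semantic Segmentation Algorithm with Knowledge Distillation of Multi-level Features Guided by Boundary Perception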

doi:10.16451/j.cnki.issn1003-6059.202409002


Abstract: To address the loss of object detail and the large parameter counts of models in traffic scenes, a traffic scene semantic segmentation algorithm with knowledge distillation of multi-level features guided by boundary perception is proposed. The algorithm smooths object segmentation boundaries with fewer parameters. First, an adaptive multi-level feature fusion module is constructed to fuse deep semantic information with shallow spatial information and to selectively highlight object boundary information and object body information. Second, an interactive attention fusion module is proposed to model long-range dependencies in the spatial and channel dimensions, enhancing information interaction between different dimensions. Finally, a boundary loss function based on candidate boundaries is proposed, and a detail-aware boundary knowledge distillation network is built to transfer boundary information from a complex teacher network. Experiments on the traffic scene datasets Cityscapes and CamVid show that the proposed algorithm achieves a lightweight model while maintaining good segmentation performance, and that it has some advantage on small and slender objects.

Key words: Semantic Segmentation, Deep Learning, Knowledge Distillation, Traffic Scene, Attention Mechanism
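
The abstract describes transferring boundary information from a teacher network to a lightweight student. The page does not give the candidate-boundary loss itself, so the following is only a minimal sketch of the general idea, under stated assumptions: boundary pixels are taken to be those with a differently labelled 4-neighbour, and the distillation term is a temperature-softened KL divergence restricted to those pixels. The function names (`boundary_mask`, `boundary_distillation_loss`, `logits_for`), the temperature value, and the toy data are all hypothetical and are not taken from the paper.

```python
import math

def boundary_mask(labels):
    """Mark pixels whose 4-neighbourhood contains a different class label.
    Assumption: a simple stand-in for the paper's candidate boundaries."""
    h, w = len(labels), len(labels[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != labels[y][x]:
                    mask[y][x] = True
                    break
    return mask

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one pixel's class logits."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def boundary_distillation_loss(teacher_logits, student_logits, mask, temperature=2.0):
    """Mean KL(teacher || student) over boundary pixels only.
    Assumption: generic soft-target distillation, not the paper's exact loss."""
    total, count = 0.0, 0
    for y, row in enumerate(mask):
        for x, is_boundary in enumerate(row):
            if is_boundary:
                p = softmax(teacher_logits[y][x], temperature)
                q = softmax(student_logits[y][x], temperature)
                total += kl_divergence(p, q)
                count += 1
    return total / count if count else 0.0

def logits_for(label_map, confidence, num_classes=2):
    """Toy per-pixel logits: `confidence` on the labelled class, 0 elsewhere."""
    return [[[confidence if c == lab else 0.0 for c in range(num_classes)]
             for lab in row] for row in label_map]

# Toy 2x4 label map with a vertical class boundary between columns 1 and 2.
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
mask = boundary_mask(labels)
teacher = logits_for(labels, 4.0)
student_matching = logits_for(labels, 4.0)   # agrees with the teacher
student_flat = logits_for(labels, 0.0)       # uniform, uninformative prediction

print(boundary_distillation_loss(teacher, student_matching, mask))
print(boundary_distillation_loss(teacher, student_flat, mask))
```

In this toy setup the loss vanishes when the student reproduces the teacher at the boundary and grows when the student is uninformative there. Pixels away from the boundary do not contribute, which is the property the sketch is meant to illustrate.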

Received: 2024-07-31

CLC Number: TP 391.4

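Funding: Supported by the National Natural Science Foundation of China (No.62006169), the Key Research and Development Program of Shanxi Province (No.202202010101005), the General Program of the Fundamental Research Program of Shanxi Province (No.202303021221141), and the Taiyuan Key Core Technology Research "Open Competition" Project (No.2024TYJB0137)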

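Corresponding author: XIE Xinlin, Ph.D., associate professor. Research interests include image semantic segmentation and deep learning. E-mail: xiexinlin@tyust.edu.cn.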

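About the authors: DUAN Zeyun, master's student. Research interests include deep learning and image semantic segmentation. E-mail: s202315210574@stu.tyust.edu.cn. LUO Chenyan, master's student. Research interests include deep learning and image semantic segmentation. E-mail: S20201503009@stu.tyust.edu.cn. XIE Gang, Ph.D., professor. Research interests include advanced control, machine vision, and fault diagnosis. E-mail: xiegang@tyust.edu.cn.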

Cite this article:

XIE Xinlin, DUAN Zeyun, LUO Chenyan, XIE Gang. Traffic Scene Semantic Segmentation Algorithm with Knowledge Distillation of Multi-level Features Guided by Boundary Perception[J]. Pattern Recognition and Artificial Intelligence, 2024, 37(9): 770-785.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202409002 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2024/V37/I9/770
